119 research outputs found
When the present web is later the past: web historiography, digital history and internet studies
"Taking as point of departure that since the mid-1990s the web has been an essential medium within society as well as in academia this article addresses some fundamental questions related to web historiography, that is the writing of the history of the web. After a brief identification of some limitations within digital history and Internet studies vis-a-vis web historiography it is argued that the web is in itself an important historical source, and that special attention must be drawn to the web in web archives - termed reborn-digital material - since these sources will probably be the only web left for future historians. In line with this argument the remainder of the article discusses the following methodological issues: What characterizes the reborn-digital material in web archives, and how does this affect the historian's use of the material as well as the possible application of digital analytical tools on this kind of material?" (author's abstract)
Web as data. Challenges and triumphs of creating and working with a derived web corpus
Ditte Laursen: Web as data. Challenges and triumphs of creating and working with a derived web corpus,
Aarhus Conference 2022, Monday 17 October
Visit at the Royal Library and Netarkivet
Ditte Laursen: Visit at the Royal Library and Netarkivet,
Aarhus Conference 2022, Monday 17 October
Developing Datasheets for Archived Web Datasets
Emily Maemura: Developing Datasheets for Archived Web Datasets,
Aarhus Conference 2022, Monday 17 October
ArchiveSpark: Efficient Web Archive Access, Extraction and Derivation
Web archives are a valuable resource for researchers of various disciplines.
However, to use them as a scholarly source, researchers require a tool that
provides efficient access to Web archive data for extraction and derivation of
smaller datasets. Besides efficient access we identify five other objectives
based on practical researcher needs such as ease of use, extensibility and
reusability.
Towards these objectives we propose ArchiveSpark, a framework for efficient,
distributed Web archive processing that builds a research corpus by working on
existing and standardized data formats commonly held by Web archiving
institutions. Performance optimizations in ArchiveSpark, facilitated by the use
of a widely available metadata index, result in significant speed-ups of data
processing. Our benchmarks show that ArchiveSpark is faster than alternative
approaches without depending on any additional data stores while improving
usability by seamlessly integrating queries and derivations with external
tools.
Comment: JCDL 2016, Newark, NJ, US
The Book as Medium (Bogen som medie)
In "The Book as Medium" ("Bogen som medie"), Niels Brügger outlines the historical transformations of the book. The book, the medium of literature, is not merely a historical constant that gives life to literature like a transparent frame. Rather, the historical development of the book, and artistic experiments with the book as a medium, show that literature and medium are not separate phenomena
DIGITAL HISTORY AND THE ARCHIVED WEB AS A HISTORICAL SOURCE (DIGITAL HISTORIE OG ARKIVERET WEB SOM HISTORISK KILDE)
Over the past decade, the amount of digitally stored data has grown explosively, and over the same period the amount of born-digital material, such as content on social media and the web, has been growing. The historians of the future will have to navigate a landscape in which the sources are increasingly digital and in many cases only digital. This article argues that digital sources are not all alike merely because they are digital, which leads to a fundamental distinction between digitized, born-digital and reborn-digital sources. It then introduces one particular type of reborn-digital material, namely the archived web, which is compared with digitized newspaper archives. Finally, it discusses the consequences that the particular characteristics of the archived web have for its use as a historical source